AU 1814
SEQUENCE LISTING



(1) GENERAL INFORMATION:



(i) APPLICANT: CAPUT, DANIEL 

FERRARA, PASCUAL 
GUILLEMOT, JEAN-CLAUDE 
KAGHAD, MOURAD 
LEGOUX, RICHARD 
LOISON, GERARD 
LARBRE, ELIZABETH 
LUPKER, JOHANNES 
LEPLATOIS, PASCUAL 
SALOME, MARK 

(ii) TITLE OF INVENTION: URATE OXIDASE ACTIVITY PROTEIN, 

RECOMBINANT GENE CODING THEREFOR, EXPRESSION VECTOR, 
MICRO-ORGANISMS AND TRANSFORMED CELLS 

(iii) NUMBER OF SEQUENCES: 36 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: Foley & Lardner 

(B) STREET: 1800 Diagonal Road, Suite 500 

(C) CITY: Alexandria 

(D) STATE: Virginia 

(E) COUNTRY: USA 

(F) ZIP: 22313-0299 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: PatentIn Release #1.0, Version #1.25

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: US/07/920,519 

(B) FILING DATE: 

(C) CLASSIFICATION: 

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: US/07/659,408 

(B) FILING DATE: 

(viii) ATTORNEY/AGENT INFORMATION:

(A) NAME: BENT, Stephen A. 

(B) REGISTRATION NUMBER: 29,768 

(C) REFERENCE/DOCKET NUMBER: 16781/276 BEDL 

(ix) TELECOMMUNICATION INFORMATION: 
(A) TELEPHONE: (703)836-9300
52 (B) TELEFAX: (703)683-4109

53 (C) TELEX: 899149 
54 

55 

56 (2) INFORMATION FOR SEQ ID NO:1:
57 

58 (i) SEQUENCE CHARACTERISTICS: 

59 (A) LENGTH: 301 amino acids 

60 (B) TYPE: amino acid 

61 (D) TOPOLOGY: linear 
62 

63 (ii) MOLECULE TYPE: protein 

64 

65 (iii) HYPOTHETICAL: NO 

66 

67 (vi) ORIGINAL SOURCE: 

68 (A) ORGANISM: Aspergillus flavus 
69 

70 (vii) IMMEDIATE SOURCE: 

71 (B) CLONE: Urate oxidase 
72 

73 

74 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

75 

76 Ser Ala Val Lys Ala Ala Arg Tyr Gly Lys Asp Asn Val Arg Val Tyr
77 1 5 10 15
78

79 Lys Val His Lys Asp Glu Lys Thr Gly Val Gln Thr Val Tyr Glu Met
80 20 25 30
81

82 Thr Val Cys Val Leu Leu Glu Gly Glu Ile Glu Thr Ser Tyr Thr Lys
83 35 40 45
84

85 Ala Asp Asn Ser Val Ile Val Ala Thr Asp Ser Ile Lys Asn Thr Ile
86 50 55 60
87

88 Tyr Ile Thr Ala Lys Gln Asn Pro Val Thr Pro Pro Glu Leu Phe Gly
89 65 70 75 80
90

91 Ser Ile Leu Gly Thr His Phe Ile Glu Lys Tyr Asn His Ile His Ala
92 85 90 95
93

94 Ala His Val Asn Ile Val Cys His Arg Trp Thr Arg Met Asp Ile Asp
95 100 105 110
96

97 Gly Lys Pro His Pro His Ser Phe Ile Arg Asp Ser Glu Glu Lys Arg
98 115 120 125
99

100 Asn Val Gln Val Asp Val Val Glu Gly Lys Gly Ile Asp Ile Lys Ser
101 130 135 140
102

103 Ser Leu Ser Gly Leu Thr Val Leu Lys Ser Thr Asn Ser Gln Phe Trp
104 145 150 155 160
105

106 Gly Phe Leu Arg Asp Glu Tyr Thr Thr Leu Lys Glu Thr Trp Asp Arg
107 165 170 175
108

109 Ile Leu Ser Thr Asp Val Asp Ala Thr Trp Gln Trp Lys Asn Phe Ser
110 180 185 190
111

112 Gly Leu Gln Glu Val Arg Ser His Val Pro Lys Phe Asp Ala Thr Trp
113 195 200 205
114

115 Ala Thr Ala Arg Glu Val Thr Leu Lys Thr Phe Ala Glu Asp Asn Ser
116 210 215 220
117

118 Ala Ser Val Gln Ala Thr Met Tyr Lys Met Ala Glu Gln Ile Leu Ala
119 225 230 235 240
120

121 Arg Gln Gln Leu Ile Glu Thr Val Glu Tyr Ser Leu Pro Asn Lys His
122 245 250 255
123

124 Tyr Phe Glu Ile Asp Leu Ser Trp His Lys Gly Leu Gln Asn Thr Gly
125 260 265 270
126

127 Lys Asn Ala Glu Val Phe Ala Pro Gln Ser Asp Pro Asn Gly Leu Ile
128 275 280 285
129

130 Lys Cys Thr Val Gly Arg Ser Ser Leu Lys Ser Lys Leu
131 290 295 300
132 

133 (2) INFORMATION FOR SEQ ID NO: 2: 
134 

135 (i) SEQUENCE CHARACTERISTICS: 

136 (A) LENGTH: 302 amino acids 

137 (B) TYPE: amino acid 

138 (C) STRANDEDNESS: single 

139 (D) TOPOLOGY: linear 
140 

141 (ii) MOLECULE TYPE: protein 

142 

143 (iii) HYPOTHETICAL: NO 

144 

145 (vi) ORIGINAL SOURCE: 

146 (A) ORGANISM: Aspergillus flavus 
147 

148 (vii) IMMEDIATE SOURCE: 

149 (B) CLONE: Met-Urate oxidase 
150 

151 

152 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

153

Met Ser Ala Val Lys Ala Ala Arg Tyr Gly Lys Asp Asn Val Arg Val
1 5 10 15

Tyr Lys Val His Lys Asp Glu Lys Thr Gly Val Gln Thr Val Tyr Glu
20 25 30

Met Thr Val Cys Val Leu Leu Glu Gly Glu Ile Glu Thr Ser Tyr Thr
35 40 45

Lys Ala Asp Asn Ser Val Ile Val Ala Thr Asp Ser Ile Lys Asn Thr
50 55 60

Ile Tyr Ile Thr Ala Lys Gln Asn Pro Val Thr Pro Pro Glu Leu Phe
65 70 75 80

Gly Ser Ile Leu Gly Thr His Phe Ile Glu Lys Tyr Asn His Ile His
85 90 95

Ala Ala His Val Asn Ile Val Cys His Arg Trp Thr Arg Met Asp Ile
100 105 110

Asp Gly Lys Pro His Pro His Ser Phe Ile Arg Asp Ser Glu Glu Lys
115 120 125

Arg Asn Val Gln Val Asp Val Val Glu Gly Lys Gly Ile Asp Ile Lys
130 135 140

Ser Ser Leu Ser Gly Leu Thr Val Leu Lys Ser Thr Asn Ser Gln Phe
145 150 155 160

Trp Gly Phe Leu Arg Asp Glu Tyr Thr Thr Leu Lys Glu Thr Trp Asp
165 170 175

Arg Ile Leu Ser Thr Asp Val Asp Ala Thr Trp Gln Trp Lys Asn Phe
180 185 190

Ser Gly Leu Gln Glu Val Arg Ser His Val Pro Lys Phe Asp Ala Thr
195 200 205

Trp Ala Thr Ala Arg Glu Val Thr Leu Lys Thr Phe Ala Glu Asp Asn
210 215 220

Ser Ala Ser Val Gln Ala Thr Met Tyr Lys Met Ala Glu Gln Ile Leu
225 230 235 240

Ala Arg Gln Gln Leu Ile Glu Thr Val Glu Tyr Ser Leu Pro Asn Lys
245 250 255

His Tyr Phe Glu Ile Asp Leu Ser Trp His Lys Gly Leu Gln Asn Thr
260 265 270

205 Gly Lys Asn Ala Glu Val Phe Ala Pro Gln Ser Asp Pro Asn Gly Leu
206 275 280 285
207

208 Ile Lys Cys Thr Val Gly Arg Ser Ser Leu Lys Ser Lys Leu
209 290 295 300
210 

211 (2) INFORMATION FOR SEQ ID NO: 3: 
212 

213 (i) SEQUENCE CHARACTERISTICS: 

214 (A) LENGTH: 906 base pairs 

215 (B) TYPE: nucleic acid 

216 (C) STRANDEDNESS: single 

217 (D) TOPOLOGY: linear 
218 

219 (ii) MOLECULE TYPE: DNA (genomic) 

220 

221 

222 (vii) IMMEDIATE SOURCE: 

223 (B) CLONE: Preferred sequence for expression in 

224 prokaryotes 
225 

226 

227 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

228 

229 ATGTCTGCGG TAAAAGCAGC GCGCTACGGC AAGGACAATG TTCGCGTCTA CAAGGTTCAC 60 
230 

231 AAGGACGAGA AGACCGGTGT CCAGACGGTG TACGAGATGA CCGTCTGTGT GCTTCTGGAG 120 
232 

233 GGTGAGATTG AGACCTCTTA CACCAAGGCC GACAACAGCG TCATTGTCGC AACCGACTCC 180 
234 

235 ATTAAGAACA CCATTTACAT CACCGCCAAG CAGAACCCCG TTACTCCTCC CGAGCTGTTC 240 
236 

237 GGCTCCATCC TGGGCACACA CTTCATTGAG AAGTACAACC ACATCCATGC CGCTCACGTC 300 
238 

239 AACATTGTCT GCCACCGCTG GACCCGGATG GACATTGACG GCAAGCCACA CCCTCACTCC 360 
240 

241 TTCATCCGCG ACAGCGAGGA GAAGCGGAAT GTGCAGGTGG ACGTGGTCGA GGGCAAGGGC 420 
242 

243 ATCGATATCA AGTCGTCTCT GTCCGGCCTG ACCGTGCTGA AGAGCACCAA CTCGCAGTTC 480 
244 

245 TGGGGCTTCC TGCGTGACGA GTACACCACA CTTAAGGAGA CCTGGGACCG TATCCTGAGC 540 
246 

247 ACCGACGTCG ATGCCACTTG GCAGTGGAAG AATTTCAGTG GACTCCAGGA GGTCCGCTCG 600 
248 

249 CACGTGCCTA AGTTCGATGC TACCTGGGCC ACTGCTCGCG AGGTCACTCT GAAGACTTTT 660 
250 

251 GCTGAAGATA ACAGTGCCAG CGTGCAGGCC ACTATGTACA AGATGGCAGA GCAAATCCTG 720 
252 

253 GCGCGCCAGC AGCTGATCGA GACTGTCGAG TACTCGTTGC CTAACAAGCA CTATTTCGAA 780 
254 

255 ATCGACCTGA GCTGGCACAA GGGCCTCCAA AACACCGGCA AGAACGCCGA GGTCTTCGCT 840
256

257 CCTCAGTCGG ACCCCAACGG TCTGATCAAG TGTACCGTCG GCCGGTCCTC TCTGAAGTCT 900 
258 

259 AAATTG 906 
260 
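
The declared length of SEQ ID NO: 3 (906 bases) corresponds to 302 codons, which is the declared length of SEQ ID NO: 2. The sketch below is an illustrative check, not part of the filed listing: it translates only the first 60 bases of SEQ ID NO: 3 with the standard genetic code. The names `translate`, `first_row`, `n_term` and `n_codons` are hypothetical helpers introduced here.

```python
# Standard genetic code, built from the conventional T, C, A, G codon ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a DNA string (whitespace allowed) into one-letter amino acids."""
    dna = "".join(dna.split()).upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# First 60 bases of SEQ ID NO: 3, copied from the listing.
first_row = "ATGTCTGCGG TAAAAGCAGC GCGCTACGGC AAGGACAATG TTCGCGTCTA CAAGGTTCAC"
n_term = translate(first_row)
n_codons = 906 // 3  # declared length of SEQ ID NO: 3 split into codons

print(n_term)    # MSAVKAARYGKDNVRVYKVH
print(n_codons)  # 302
```

The printed residues correspond to Met Ser Ala Val Lys Ala Ala Arg Tyr Gly Lys Asp Asn Val Arg Val Tyr Lys Val His, the first 20 residues given for SEQ ID NO: 2.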

261 (2) INFORMATION FOR SEQ ID NO:4: 
262 

263 (i) SEQUENCE CHARACTERISTICS: 

264 (A) LENGTH: 906 base pairs 

265 (B) TYPE: nucleic acid 

266 (C) STRANDEDNESS: single 

267 (D) TOPOLOGY: linear 
268 

269 (ii) MOLECULE TYPE: DNA (genomic) 

270 

271 

272 (vii) IMMEDIATE SOURCE: 

273 (B) CLONE: Preferred sequence for expression in 

274 eukaryotes 
275 

276 

277 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

278 

279 ATGTCTGCTG TTAAGGCTGC TAGATACGGT AAGGACAACG TTAGAGTCTA CAAGGTTCAC 60 
280 

281 AAGGACGAGA AGACCGGTGT CCAGACGGTG TACGAGATGA CCGTCTGTGT GCTTCTGGAG 120 
282 

283 GGTGAGATTG AGACCTCTTA CACCAAGGCC GACAACAGCG TCATTGTCGC AACCGACTCC 180 
284 

285 ATTAAGAACA CCATTTACAT CACCGCCAAG CAGAACCCCG TTACTCCTCC CGAGCTGTTC 240 
286 

287 GGCTCCATCC TGGGCACACA CTTCATTGAG AAGTACAACC ACATCCATGC CGCTCACGTC 300 
288 

289 AACATTGTCT GCCACCGCTG GACCCGGATG GACATTGACG GCAAGCCACA CCCTCACTCC 360 
290 

291 TTCATCCGCG ACAGCGAGGA GAAGCGGAAT GTGCAGGTGG ACGTGGTCGA GGGCAAGGGC 420 
292 

293 ATCGATATCA AGTCGTCTCT GTCCGGCCTG ACCGTGCTGA AGAGCACCAA CTCGCAGTTC 480 
294 

295 TGGGGCTTCC TGCGTGACGA GTACACCACA CTTAAGGAGA CCTGGGACCG TATCCTGAGC 540 
296 

297 ACCGACGTCG ATGCCACTTG GCAGTGGAAG AATTTCAGTG GACTCCAGGA GGTCCGCTCG 600 
298 

299 CACGTGCCTA AGTTCGATGC TACCTGGGCC ACTGCTCGCG AGGTCACTCT GAAGACTTTT 660 
300 

301 GCTGAAGATA ACAGTGCCAG CGTGCAGGCC ACTATGTACA AGATGGCAGA GCAAATCCTG 720 
302 

303 GCGCGCCAGC AGCTGATCGA GACTGTCGAG TACTCGTTGC CTAACAAGCA CTATTTCGAA 780 
304 

305 ATCGACCTGA GCTGGCACAA GGGCCTCCAA AACACCGGCA AGAACGCCGA GGTCTTCGCT 840 
306

307 CCTCAGTCGG ACCCCAACGG TCTGATCAAG TGTACCGTCG GCCGGTCCTC TCTGAAGTCT 900
308 

309 AAATTG 906 
310 
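
SEQ ID NO: 3 (prokaryotes) and SEQ ID NO: 4 (eukaryotes) appear to differ only in their first 60 bases as printed in this listing. The sketch below is an illustrative comparison of those two rows codon by codon; `codons`, `seq3_row1`, `seq4_row1` and `changed` are hypothetical names introduced here, not part of the filed listing.

```python
def codons(dna: str) -> list[str]:
    """Split a DNA string (whitespace allowed) into complete codons."""
    dna = "".join(dna.split()).upper()
    return [dna[i:i + 3] for i in range(0, len(dna) - len(dna) % 3, 3)]


# First rows (bases 1-60) of SEQ ID NO: 3 and SEQ ID NO: 4, copied from the listing.
seq3_row1 = "ATGTCTGCGG TAAAAGCAGC GCGCTACGGC AAGGACAATG TTCGCGTCTA CAAGGTTCAC"
seq4_row1 = "ATGTCTGCTG TTAAGGCTGC TAGATACGGT AAGGACAACG TTAGAGTCTA CAAGGTTCAC"

# Collect (codon number, SEQ 3 codon, SEQ 4 codon) wherever the two rows differ.
changed = [
    (n, a, b)
    for n, (a, b) in enumerate(zip(codons(seq3_row1), codons(seq4_row1)), start=1)
    if a != b
]

for n, a, b in changed:
    print(n, a, "->", b)
```

Nine of the 20 codons differ (codons 3 to 8, 10, 13 and 15). Each substituted codon encodes the same residue under the standard genetic code, which is consistent with both sequences being described as encoding the same protein.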

311 (2) INFORMATION FOR SEQ ID NO: 5: 
312 

313 (i) SEQUENCE CHARACTERISTICS: 

314 (A) LENGTH: 14 base pairs 

315 (B) TYPE: nucleic acid 

316 (C) STRANDEDNESS: single 

317 (D) TOPOLOGY: linear 
318 

319 (ii) MOLECULE TYPE: DNA (genomic) 

320 

321 (iii) HYPOTHETICAL: NO 

322 

323 

324 (vii) IMMEDIATE SOURCE: 

325 (B) CLONE: Preferred non-translated 5' sequence for 

326 animal cells 
327 

328 

329 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

330 

331 AGCTTGCCGC CACT 14 
332 

333 (2) INFORMATION FOR SEQ ID NO:6: 
334 

335 (i) SEQUENCE CHARACTERISTICS: 

336 (A) LENGTH: 906 base pairs 

337 (B) TYPE: nucleic acid 

338 (C) STRANDEDNESS: double 

339 (D) TOPOLOGY: linear 
340 

341 (ii) MOLECULE TYPE: DNA (genomic) 

342 

343 (iii) HYPOTHETICAL: NO 

344 

345 

346 (vii) IMMEDIATE SOURCE: 

347 (B) CLONE: Preferred sequence for expression in animal 

348 cells 
349 

350 

351 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

352 

353 ATGTCCGCAG TAAAAGCAGC CCGCTACGGC AAGGACAATG TCCGCGTCTA CAAGGTTCAC 60 
354 

355 AAGGACGAGA AGACCGGTGT CCAGACGGTG TACGAGATGA CCGTCTGTGT GCTTCTGGAG 120 
356 

357 GGTGAGATTG AGACCTCTTA CACCAAGGCC GACAACAGCG TCATTGTCGC AACCGACTCC 180
358

359 ATTAAGAACA CCATTTACAT CACCGCCAAG CAGAACCCCG TTACTCCTCC CGAGCTGTTC 240 
360 

361 GGCTCCATCC TGGGCACACA CTTCATTGAG AAGTACAACC ACATCCATGC CGCTCACGTC 300 
362 

363 AACATTGTCT GCCACCGCTG GACCCGGATG GACATTGACG GCAAGCCACA CCCTCACTCC 360 
364 

365 TTCATCCGCG ACAGCGAGGA GAAGCGGAAT GTGCAGGTGG ACGTGGTCGA GGGCAAGGGC 420 
366 

367 ATCGATATCA AGTCGTCTCT GTCCGGCCTG ACCGTGCTGA AGAGCACCAA CTCGCAGTTC 480 

368

369 TGGGGCTTCC TGCGTGACGA GTACACCACA CTTAAGGAGA CCTGGGACCG TATCCTGAGC 540 
370 

371 ACCGACGTCG ATGCCACTTG GCAGTGGAAG AATTTCAGTG GACTCCAGGA GGTCCGCTCG 600 
372 

373 CACGTGCCTA AGTTCGATGC TACCTGGGCC ACTGCTCGCG AGGTCACTCT GAAGACTTTT 660 
374 

375 GCTGAAGATA ACAGTGCCAG CGTGCAGGCC ACTATGTACA AGATGGCAGA GCAAATCCTG 720 
376 

377 GCGCGCCAGC AGCTGATCGA GACTGTCGAG TACTCGTTGC CTAACAAGCA CTATTTCGAA 780 
378 

379 ATCGACCTGA GCTGGCACAA GGGCCTCCAA AACACCGGCA AGAACGCCGA GGTCTTCGCT 840 
380 

381 CCTCAGTCGG ACCCCAACGG TCTGATCAAG TGTACCGTCG GCCGGTCCTC TCTGAAGTCT 900 
382 

383 AAATTG 906 
384 

385 (2) INFORMATION FOR SEQ ID NO: 7: 
386 

387 (i) SEQUENCE CHARACTERISTICS: 

388 (A) LENGTH: 23 base pairs 

389 (B) TYPE: nucleic acid 

390 (C) STRANDEDNESS: single 

391 (D) TOPOLOGY: linear 
392 

393 (ii) MOLECULE TYPE: DNA (genomic) 

394 

395 (iii) HYPOTHETICAL: NO 

396 

397 

398 (vii) IMMEDIATE SOURCE: 

399 (B) CLONE: reverse transcription primer 
400 

401 

402 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:7: 

403 

404 GATCCGGGCC CTTTTTTTTT TTT 23 
405 

406 (2) INFORMATION FOR SEQ ID NO: 8: 
407 

408 (i) SEQUENCE CHARACTERISTICS:

409 (A) LENGTH: 10 amino acids

410 (B) TYPE: amino acid 

411 (C) STRANDEDNESS: single 

412 (D) TOPOLOGY: linear 
413 

414 (ii) MOLECULE TYPE: peptide 

415 

416 (iii) HYPOTHETICAL: NO 

417 

418 

419 (vii) IMMEDIATE SOURCE: 

420 (B) CLONE: Hydrolysis product T 17 
421 

422 

423 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 

424 

425 Asn Val Gln Val Asp Val Val Glu Gly Lys

426 1 5 10
427 
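
The hydrolysis-product peptides can be located within SEQ ID NO: 1 by simple substring search. The sketch below is illustrative only: it converts three-letter residue codes to one-letter codes and places peptide T 17 within the row of SEQ ID NO: 1 that starts at residue 129. `THREE_TO_ONE`, `to_one_letter`, `ROW_START` and the other names are hypothetical helpers introduced here.

```python
# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(residues: str) -> str:
    """Convert space-separated three-letter codes to a one-letter string."""
    return "".join(THREE_TO_ONE[r] for r in residues.split())


# Residues 129-144 of SEQ ID NO: 1 (the row labelled 130 / 135 / 140).
seq1_row = "Asn Val Gln Val Asp Val Val Glu Gly Lys Gly Ile Asp Ile Lys Ser"
ROW_START = 129

# SEQ ID NO: 8, hydrolysis product T 17.
t17 = "Asn Val Gln Val Asp Val Val Glu Gly Lys"

offset = to_one_letter(seq1_row).find(to_one_letter(t17))
first = ROW_START + offset
last = first + len(t17.split()) - 1
print(first, last)  # 129 138
```

Under these assumptions, T 17 spans residues 129 to 138 of SEQ ID NO: 1.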

428 (2) INFORMATION FOR SEQ ID NO: 9: 
429 

430 (i) SEQUENCE CHARACTERISTICS: 

431 (A) LENGTH: 8 amino acids 

432 (B) TYPE: amino acid 

433 (C) STRANDEDNESS: single 

434 (D) TOPOLOGY: linear 
435 

436 (ii) MOLECULE TYPE: peptide 

437 

438 (iii) HYPOTHETICAL: NO 

439 

440 

441 (vii) IMMEDIATE SOURCE: 

442 (B) CLONE: Hydrolysis product T 20 
443 

444 

445 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:9: 

446 

447 Asn Phe Ser Gly Leu Gln Glu Val

448 1 5
449 

450 (2) INFORMATION FOR SEQ ID NO: 10: 
451 

452 (i) SEQUENCE CHARACTERISTICS: 

453 (A) LENGTH: 6 amino acids 

454 (B) TYPE: amino acid 

455 (C) STRANDEDNESS: single 

456 (D) TOPOLOGY: linear 
457 

458 (ii) MOLECULE TYPE: peptide 

459
(iii) HYPOTHETICAL: NO



(vii) IMMEDIATE SOURCE: 

(B) CLONE: Hydrolysis product T 23 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

Phe Asp Ala Thr Trp Ala
1 5

(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 



(vii) IMMEDIATE SOURCE: 

(B) CLONE: Hydrolysis product T 27 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 

His Tyr Phe Glu Ile Asp Leu Ser
1 5 

(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 



(vii) IMMEDIATE SOURCE: 

(B) CLONE: Hydrolysis product T 28

511 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

512 

513 Ile Leu Ser Thr Asp Val Asp Ala Thr Trp Gln Trp Lys

514 1 5 10
515 

516 (2) INFORMATION FOR SEQ ID NO: 13: 
517 

518 (i) SEQUENCE CHARACTERISTICS: 

519 (A) LENGTH: 11 amino acids 

520 (B) TYPE: amino acid 

521 (C) STRANDEDNESS: single 

522 (D) TOPOLOGY: linear 
523 

524 (ii) MOLECULE TYPE: peptide 

525 

526 (iii) HYPOTHETICAL: NO 

527 

528 

529 (vii) IMMEDIATE SOURCE: 

530 (B) CLONE: Hydrolysis product T 29 
531 

532 

533 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 

534 

535 His Tyr Phe Glu Ile Asp Leu Ser Trp His Lys

536 1 5 10 
537 

538 (2) INFORMATION FOR SEQ ID NO: 14: 
539 

540 (i) SEQUENCE CHARACTERISTICS: 

541 (A) LENGTH: 11 amino acids 

542 (B) TYPE: amino acid 

543 (C) STRANDEDNESS: single 

544 (D) TOPOLOGY: linear 
545 

546 (ii) MOLECULE TYPE: peptide 

547 

548 (iii) HYPOTHETICAL: NO 

549 

550 

551 (vii) IMMEDIATE SOURCE: 

552 (B) CLONE: Hydrolysis product T 31 
553 

554 

555 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 

556 

557 Ser Thr Asn Ser Gln Phe Trp Gly Phe Leu Arg

558 1 5 10 
559 

560 (2) INFORMATION FOR SEQ ID NO: 15: 
561

562 (i) SEQUENCE CHARACTERISTICS:

563 (A) LENGTH: 16 amino acids 

564 (B) TYPE: amino acid 

565 (C) STRANDEDNESS: single 

566 (D) TOPOLOGY: linear 
567 

568 (ii) MOLECULE TYPE: peptide 

569 

570 (iii) HYPOTHETICAL: NO 

571 

572 

573 (vii) IMMEDIATE SOURCE: 

574 (B) CLONE: Hydrolysis product T 32 
575 

576 

577 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 

578 

579 Gln Asn Pro Val Thr Pro Pro Glu Leu Phe Gly Ser Ile Leu Gly Thr

580 1 5 10 15
581 

582 

583 (2) INFORMATION FOR SEQ ID NO: 16: 
584 

585 (i) SEQUENCE CHARACTERISTICS: 



586 (A) LENGTH: 16 amino acids 

587 (B) TYPE: amino acid 

588 (C) STRANDEDNESS: single 

589 (D) TOPOLOGY: linear 
590 



591 (ii) MOLECULE TYPE: peptide 

592 

593 (iii) HYPOTHETICAL: NO 

594 

595 

596 (vii) IMMEDIATE SOURCE: 

597 (B) CLONE: Hydrolysis product T 33 
598 

599 

600 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 

601 

602 Gln Asn Pro Val Thr Pro Pro Glu Leu Phe Gly Ser Ile Leu Gly Thr

603 1 5 10 15
604 

605 

606 (2) INFORMATION FOR SEQ ID NO: 17: 
607 

608 (i) SEQUENCE CHARACTERISTICS: 



609 (A) LENGTH: 17 amino acids 

610 (B) TYPE: amino acid 

611 (C) STRANDEDNESS: single 

612 (D) TOPOLOGY: linear
613

614 (ii) MOLECULE TYPE: peptide 

615 

616 (iii) HYPOTHETICAL: NO 

617 

618 

619 (vii) IMMEDIATE SOURCE: 

620 (B) CLONE: Hydrolysis product V 1 
621 

622 

623 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 

624 

625 Tyr Ser Leu Pro Asn Lys His Tyr Phe Glu Ile Asp Leu Ser Trp His

626 1 5 10 15
627 

628 Lys 

629 

630 

631 (2) INFORMATION FOR SEQ ID NO:18: 
632 

633 (i) SEQUENCE CHARACTERISTICS: 

634 (A) LENGTH: 16 amino acids 

635 (B) TYPE: amino acid 

636 (C) STRANDEDNESS: single 

637 (D) TOPOLOGY: linear 
638 

639 (ii) MOLECULE TYPE: peptide 

640 

641 (iii) HYPOTHETICAL: NO 

642 

643 

644 (vii) IMMEDIATE SOURCE: 

645 (B) CLONE: Hydrolysis product V 2 
646 

647 

648 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 

649 

650 Val Thr Leu Lys Thr Phe Ala Glu Asp Asn Ser Ala Ser Val Gln Ala

651 1 5 10 15
652 

653 

654 (2) INFORMATION FOR SEQ ID NO: 19: 
655 

656 (i) SEQUENCE CHARACTERISTICS: 

657 (A) LENGTH: 24 amino acids 

658 (B) TYPE: amino acid 

659 (C) STRANDEDNESS: single 

660 (D) TOPOLOGY: linear 
661 

662 (ii) MOLECULE TYPE: peptide 

663

664 (iii) HYPOTHETICAL: NO

665 

666 

667 (vii) IMMEDIATE SOURCE: 

668 (B) CLONE: Hydrolysis product V 3 
669 

670 

671 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 

672 

673 Thr Ser Tyr Thr Lys Ala Asp Asn Ser Val Ile Val Asp Thr Asp Ser

674 1 5 10 15
675

676 Ile Lys Asn Thr Ile Tyr Ile Thr

677 20 
678 

679 (2) INFORMATION FOR SEQ ID NO: 20: 
680 

681 (i) SEQUENCE CHARACTERISTICS: 

682 (A) LENGTH: 28 amino acids 

683 (B) TYPE: amino acid 

684 (C) STRANDEDNESS: single 

685 (D) TOPOLOGY: linear 
686 

687 (ii) MOLECULE TYPE: peptide 

688 

689 (iii) HYPOTHETICAL: NO 

690 

691 

692 (vii) IMMEDIATE SOURCE: 

693 (B) CLONE: Hydrolysis product V 5 
694 

695 

696 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 

697 

698 Gly Lys Gly Ile Asp Ile Lys Ser Ser Leu Ser Gly Leu Thr Val Leu

699 1 5 10 15
700

701 Lys Ser Thr Asn Ser Gln Phe Trp Gly Phe Leu Arg

702 20 25 
703 

704 (2) INFORMATION FOR SEQ ID NO: 21: 
705 

706 (i) SEQUENCE CHARACTERISTICS: 

707 (A) LENGTH: 17 amino acids 

708 (B) TYPE: amino acid 

709 (C) STRANDEDNESS: single 

710 (D) TOPOLOGY: linear 
711 

712 (ii) MOLECULE TYPE: peptide 

713 

714 (iii) HYPOTHETICAL: NO
715
716

717 (vii) IMMEDIATE SOURCE:

718 (B) CLONE: Hydrolysis product V 6
719 

720 

721 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:21: 

722 

723 Gly Lys Gly Ile Asp Ile Lys Ser Ser Leu Ser Gly Leu Thr Val Leu

724 1 5 10 15
725 

726 Lys 

727 

728 

729 (2) INFORMATION FOR SEQ ID NO: 22: 
730 

731 (i) SEQUENCE CHARACTERISTICS: 

732 (A) LENGTH: 1236 base pairs 

733 (B) TYPE: nucleic acid 

734 (C) STRANDEDNESS: single 

735 (D) TOPOLOGY: linear 
736 

737 (ii) MOLECULE TYPE: DNA (genomic) 

738 

739 (iii) HYPOTHETICAL: NO 

740 

741 

742 (vii) IMMEDIATE SOURCE: 

743 (B) CLONE: Fragment 3 
744 

745 

746 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:22: 

747 

748 GATCCGCGGA AGCATAAAGT GTAAAGCCTG GGGTGCCTAA TGAGTGAGCT AACTTACATT 60 
749 

750 AATTGCGTTG CGCTCACTGC CCGCTTTCCA GTCGGGAAAC CTGTCGTGCC AGCTGCATTA 120 
751 

752 ATGAATCGGC CAACGCGCGG GGAGAGGCGG TTTGCGTATT GGGCGCCAGG GTGGTTTTTC 180
753 

754 TTTTCACCAG TGAGACGGGC AACAGCTGAT TGCCCTTCAC CGCCTGGCCC TGAGAGAGTT 240 
755 

756 GCAGCAAGCG GTCCACGCTG GTTTGCCCCA CCACCCGAAA ATCCTGTTTG ATGGTGGTTA 300 
757 

758 ACGGCGGGAT ATAACATGAG CTGTCTTCGG TATCGTCGTA TCCCACTACC GAGATATCCG 360 
759 

760 CACCAACGCG CAGCCCGGAC TCGGTAATGG CGCGCATTGC GCCCAGCGCC ATCTGATCGT 420 
761 

762 TGGCAACCAG CATCGCAGTG GGAACGATGC CCTCATTCAG CATTTGCATG GTTTGTTGAA 480 
763 

764 AACCGGACAT GGCACTCCAG TCGCCTTCCC GTTCCGCTAT CGGCTGAATT TGATTGCGAG 540 
765

TGAGATATTT ATGCCAGCCA GCCAGACGCA GACGCGCCGA GACAGAACTT AATGGGCCCG 600

CTAACAGCGC GATTTGCTGG TGACCCAATG CGACCAGATG CTCCACGCCC AGTCGCGTAC 660

CGTCTTCATG GGAGAAAATA ATACTGTTGA TGGGTGTCTG GTCAGAGACA TCAAGAAATA 720

ACGCCGGAAC ATTAGTGCAG GCAGCTTCCA CAGCAATGGC ATCCTGGTCA TCCAGCGGAT 780

AGTTAATGAT CAGCCCACTG ACGCGTTGCG CGAGAAGATT GTGCACCGCC GCTTTACAGG 840

CTTCGACGCC GCTTCGTTCT ACCATCGACA CCACCACGCT GGCACCCAGT TGATCGGCGC 900

GAGATTTAAT CGCCGCGACA ATTTGCGACG GCGCGTGCAG GGCCAGACTG GAGGTGGCAA 960

CGCCAATCAG CAACGACTGT TTGCCCGCCA GTTGTTGTGC CACGCGGTTG GGAATGTAAT 1020

TCAGCTCCGC CATCGCCGCT TCCACTTTTT CCCGCGTTTT CGCAGAAACG TGGCTGGCCT 1080

GGTTCACCAC GCGGGAAACG GTCTGATAAC AGACACCGGC ATACTCTGCG ACATCGTATA 1140

ACGTTACTGG TTTCACATTC ACCACCCTGA ATTGACTCTC TTCCGGGCGC TATCATGCCA 1200

TACCGCGAAA GGTTTTGCGC CATTCGATGG TGTCCG 1236

(2) INFORMATION FOR SEQ ID NO: 23:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 321 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(vii) IMMEDIATE SOURCE:

(B) CLONE: Fragment 4

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 107..316

(D) OTHER INFORMATION: /product= "regulatory signal + aa
1-44 human growth hormone precursor"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:

TCGAGCTGAC TGACCTGTTG CTTATATTAC ATCGATAGCG TATAATGTGT GGAATTGTGA 60

817 GCGATAACAA TTTCACACAG TTTAACTTTA AGAAGGAGAT ATACAT ATG GCT ACC 115

818 Met Ala Thr 

819 1 
820 

821 GGA TCC CGG ACT AGT CTG CTC CTG GCT TTT GGC CTG CTC TGC CTG CCC 163 

822 Gly Ser Arg Thr Ser Leu Leu Leu Ala Phe Gly Leu Leu Cys Leu Pro 

823 5 10 15 
824 

825 TGG CTT CAA GAG GGC AGT GCC TTC CCA ACC ATT CCC TTA TCT AGA CTT 211 

826 Trp Leu Gln Glu Gly Ser Ala Phe Pro Thr Ile Pro Leu Ser Arg Leu

827 20 25 30 35
828

829 TTT GAC AAC GCT ATG CTC CGC GCC CAT CGT CTG CAC CAG CTG GCC TTT 259

830 Phe Asp Asn Ala Met Leu Arg Ala His Arg Leu His Gln Leu Ala Phe

831 40 45 50
832

833 GAC ACC TAC CAG GAG TTT GAA GAA GCC TAT ATC CCA AAG GAA CAG AAG 307

834 Asp Thr Tyr Gln Glu Phe Glu Glu Ala Tyr Ile Pro Lys Glu Gln Lys

835 55 60 65
836

837 TAT TCA TTC CTGCA 321

838 Tyr Ser Phe

839 70
840 
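
The CDS feature of SEQ ID NO: 23 (LOCATION 107..316) can be checked against the 70-residue translation listed as SEQ ID NO: 24. The sketch below is an illustrative consistency check, not part of the filed listing: it uses only the first 115 bases as printed, and the names `row1`, `row2`, `CDS_START` and `CDS_END` are hypothetical helpers introduced here.

```python
# Bases 1-60 and 61-115 of SEQ ID NO: 23, copied from the listing.
row1 = "TCGAGCTGAC TGACCTGTTG CTTATATTAC ATCGATAGCG TATAATGTGT GGAATTGTGA"
row2 = "GCGATAACAA TTTCACACAG TTTAACTTTA AGAAGGAGAT ATACAT ATG GCT ACC"
seq = "".join((row1 + row2).split())

# CDS location from the FEATURE block (1-based, inclusive).
CDS_START, CDS_END = 107, 316

start_codon = seq[CDS_START - 1:CDS_START + 2]  # bases 107-109
cds_length = CDS_END - CDS_START + 1
n_codons, remainder = divmod(cds_length, 3)

print(len(seq), start_codon, cds_length, n_codons, remainder)
# 115 ATG 210 70 0
```

The 210-base CDS divides evenly into 70 codons and begins with ATG at base 107, in agreement with the 70 amino acids declared for SEQ ID NO: 24.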

841 

842 (2) INFORMATION FOR SEQ ID NO: 24: 
843 

844 (i) SEQUENCE CHARACTERISTICS: 

845 (A) LENGTH: 70 amino acids 

846 (B) TYPE: amino acid 

847 (D) TOPOLOGY: linear 
848 

849 (ii) MOLECULE TYPE: protein 

850 

851 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:24: 

852 

853 Met Ala Thr Gly Ser Arg Thr Ser Leu Leu Leu Ala Phe Gly Leu Leu 

854 1 5 10 15 
855 

856 Cys Leu Pro Trp Leu Gln Glu Gly Ser Ala Phe Pro Thr Ile Pro Leu

857 20 25 30
858

859 Ser Arg Leu Phe Asp Asn Ala Met Leu Arg Ala His Arg Leu His Gln

860 35 40 45
861

862 Leu Ala Phe Asp Thr Tyr Gln Glu Phe Glu Glu Ala Tyr Ile Pro Lys

863 50 55 60
864

865 Glu Gln Lys Tyr Ser Phe

866 65 70
867

868 (2) INFORMATION FOR SEQ ID NO:25:
869 

870 (i) SEQUENCE CHARACTERISTICS: 

871 (A) LENGTH: 74 base pairs 

872 (B) TYPE: nucleic acid 

873 (C) STRANDEDNESS: double 

874 (D) TOPOLOGY: linear 
875 

876 (ii) MOLECULE TYPE: DNA (genomic) 

877 

878 (iii) HYPOTHETICAL: NO 

879 

880 

881 (vii) IMMEDIATE SOURCE: 

882 (B) CLONE: ClaI-NdeI fragment
883 

884 

885 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25: 

886 

887 CGATAGCGTA TAATGTGTGG AATTGTGAGC GGATAACAAT TTCACACAGT TTTTCGCGAA 60 
888 

889 GAAGGAGATA TACA 74 
890 

891 (2) INFORMATION FOR SEQ ID NO:26: 
892 

893 (i) SEQUENCE CHARACTERISTICS: 

894 (A) LENGTH: 190 base pairs 

895 (B) TYPE: nucleic acid 

896 (C) STRANDEDNESS: double 

897 (D) TOPOLOGY: linear 
898 

899 (ii) MOLECULE TYPE: DNA (genomic) 

900 

901 (iii) HYPOTHETICAL: NO 

902 

903 

904 (vii) IMMEDIATE SOURCE: 

905 (B) CLONE: Plasmid p373,2 fragment 
906 

907 

908 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26: 

909 

910 GATCTTCAAG CAGACCTACA GCAAGTTCGA CACAAACTCA CACAACGATG ACGCACTACT 60 
911 

912 CAAGAACTAC GGGCTGCTCT ACTGCTTCAG GAAGGACATG GACAAGGTCG AGACATTCCT 120 
913 

914 GCGCATCGTG CAGTGCCGCT CTGTGGAGGG CAGCTGTGGC TTCTAGTAAG GTACCCTGCC 180 
915 

916 CTACGTACCA 190 
917 

918 (2) INFORMATION FOR SEQ ID NO: 27:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 48 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(vii) IMMEDIATE SOURCE:

(B) CLONE: AccI-NdeI synthetic fragment

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:

TATGTCTGCG GTAAAAGCAG CGCGCTACGG CAAGGACAAT GTTCGCGT 48

(2) INFORMATION FOR SEQ ID NO: 28:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 360 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(vii) IMMEDIATE SOURCE:

(B) CLONE: Plasmid pEMR469 fragment

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:28:

GGGACGCGTC TCCTCTGCCG GAACACCGGG CATCTCCAAC TTATAAGTTG GAGAAATAAG 60

AGAATTTCAG ATTGAGAGAA TGAAAAAAAA AAAAAAAAAA AAGGCAGAGG AGAGCATAGA 120

AATGGGGTTC ACTTTTTGGT AAAGCTATAG CATGCCTATC ACATATAAAT AGAGTGCCAG 180

TAGCGACTTT TTTCACACTC GAGATACTCT TACTACTGCT CTCTTGTTGT TTTTATCACT 240

TCTTGTTTCT TCTTGGTAAA TAGAATATCA AGCTACAAAA AGCATACAAT CAACTATCAA 300

CTATTAACTA TATCGATACC ATATGGATCC GTCGACTCTA GAGGATCGTC GACTCTAGAG 360



(2) INFORMATION FOR SEQ ID NO: 29: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 58 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 



(vii) IMMEDIATE SOURCE: 

(B) CLONE: Fragment C 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29: 
CGATATACAC AATGTCTGCT GTTAAGGCTG CTAGATACGG TAAGGACAAC GTTAGAGT 58 
(2) INFORMATION FOR SEQ ID NO: 30: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1013 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 



(vii) IMMEDIATE SOURCE: 

(B) CLONE: Fragment D 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30: 

CTACAAGGTT CACAAGGACG AGAAGACCGG TGTCCAGACG GTGTACGAGA TGACCGTCTG 60 

TGTGCTTCTG GAGGGTGAGA TTGAGACCTC TTACACCAAG GCCGACAACA GCGTCATTGT 120 

CGCAACCGAC TCCATTAAGA ACACCATTTA CATCACCGCC AAGCAGAACC CCGTTACTCC 180 

TCCCGAGCTG TTCGGCTCCA TCCTGGGCAC ACACTTCATT GAGAAGTACA ACCACATCCA 240 

TGCCGCTCAC GTCAACATTG TCTGCCACCG CTGGACCCGG ATGGACATTG ACGGCAAGCC 300

ACACCCTCAC TCCTTCATCC GCGACAGCGA GGAGAAGCGG AATGTGCAGG TGGACGTGGT 360

CGAGGGCAAG GGCATCGATA TCAAGTCGTC TCTGTCCGGC CTGACCGTGC TGAAGAGCAC 420

CAACTCGCAG TTCTGGGGCT TCCTGCGTGA CGAGTACACC ACACTTAAGG AGACCTGGGA 480

CCGTATCCTG AGCACCGACG TCGATGCCAC TTGGCAGTGG AAGAATTTCA GTGGACTCCA 540

GGAGGTCCGC TCGCACGTGC CTAAGTTCGA TGCTACCTGG GCCACTGCTC GCGAGGTCAC 600

TCTGAAGACT TTTGCTGAAG ATAACAGTGC CAGCGTGCAG GCCACTATGT ACAAGATGGC 660

AGAGCAAATC CTGGCGCGCC AGCAGCTGAT CGAGACTGTC GAGTACTCGT TGCCTAACAA 720

GCACTATTTC GAAATCGACC TGAGCTGGCA CAAGGGCCTC CAAAACACCG GCAAGAACGC 780

CGAGGTCTTC GCTCCTCAGT CGGACCCCAA CGGTCTGATC AAGTGTACCG TCGGCCGGTC 840

CTCTCTGAAG TCTAAATTGT AAACCAACAT GATTCTCACG TTCCGGAGTT TCCAAGGCAA 900

ACTGTATATA GTCTGGGATA GGGTATAGCA TTCATTCACT TGTTTTTTAC TTCCAAAAAA 960

AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAGGGC CCG 1013

(2) INFORMATION FOR SEQ ID NO: 31:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 207 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(vii) IMMEDIATE SOURCE:

(B) CLONE: Synthetic GAL 7 fragment

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31:

CGCGTCTATA CTTCGGAGCA CTGTTGAGCG AAGGCTCATT AGATATATTT TCTGTCATTT 60

TCCTTAACCC AAAAATAAGG GAGAGGGTCC AAAAAGCGCT CGGACAACTG TTGACCGTGA 120

TCCGAAGGAC TGGCTATACA GTGTTCACAA AATAGCCAAG CTGAAAATAA TGTGTAGCCT 180

TTAGCTATGT TCAGTTAGTT TGGCATG 207



(2) INFORMATION FOR SEQ ID NO:32: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 



(vii) IMMEDIATE SOURCE: 

(B) CLONE: Modified XbaI-MluI adapter



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32: 
CTAGGCTAGC GGGCCCGCAT GCA 23 
(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 422 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 



(vii) IMMEDIATE SOURCE: 

(B) CLONE: Plasmid pSE1 "site binding to HindIII"
fragment



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33: 

AGCTGGCTCG CATCTCTCCT TCACGCGCCC GCCGCCCTAC CTGAGGCCGC CATCCACGCC 60 

GGTGAGTCGC GTTCTGCCGC CTCCCGCCTG TGGTGCCTCC TGAACTGCGT CCGCCGTCTA 120 

GGTAGGCTCC AAGGGAGCCG GACAAAGGCC CGGTCTCGAC CTGAGCTCTA AACTTACCTA 180 

GACTCAGCCG GCTCTCCACG CTTTGCCTGA CCCTGCTTGC TCAACTCTAC GTCTTTGTTT 240 

CGTTTTCTGT TCTGCGCCGT TACAACTTCA AGGTATGCGC TGGGACCTGG CAGGCGGCAT 300

1123 CTGGGACCCC TAGGAAGGGC TTGGGGGTCC TCGTGCCCAA GGCAGGGAAC ATAGTGGTCC 360
1124 

1125 CAGGAAGGGG AGCAGAGGCA TCAGGGTGTC CACTTTGTCT CCGCAGCTCC TGAGCCTGCA 420 
1126 

1127 GA 422 
1128 

1129 (2) INFORMATION FOR SEQ ID NO:34: 
1130 

1131 (i) SEQUENCE CHARACTERISTICS: 

1132 (A) LENGTH: 77 base pairs 

1133 (B) TYPE: nucleic acid 

1134 (C) STRANDEDNESS: double 

1135 (D) TOPOLOGY: linear 
1136 

1137 (ii) MOLECULE TYPE: DNA (genomic) 

1138 

1139 (iii) HYPOTHETICAL: NO 

1140 

1141 

1142 (vii) IMMEDIATE SOURCE: 

1143 (B) CLONE: Synthetic HindIII-"site binding to BamHI" 

1144 fragment 
1145 

1146 

1147 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:34: 

1148 

1149 AGCTTGTCGA CTAATACGAC TCACTATAGG GCGGCCGCGG GCCCCTGCAG GAATTCGGAT 60 
1150 

1151 CCCCCGGGTG ACTGACT 77 
1152 

1153 (2) INFORMATION FOR SEQ ID NO: 35: 
1154 

1155 (i) SEQUENCE CHARACTERISTICS: 

1156 (A) LENGTH: 61 base pairs 

1157 (B) TYPE: nucleic acid 

1158 (C) STRANDEDNESS: double 

1159 (D) TOPOLOGY: linear 
1160 

1161 (ii) MOLECULE TYPE: DNA (genomic) 

1162 

1163 (iii) HYPOTHETICAL: NO 

1164 

1165 

1166 (vii) IMMEDIATE SOURCE: 

1167 (B) CLONE: Synthetic HindIII-AccI fragment
1168 

1169 

1170 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:35: 

1171 

1172 AGCTTGCCGC CACTATGTCC GCAGTAAAAG CAGCCCGCTA CGGCAAGGAC AATGTCCGCG 60 
1173

61

(2) INFORMATION FOR SEQ ID NO: 36:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 920 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(vii) IMMEDIATE SOURCE:

(B) CLONE: HindIII-SnaBI fragment

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:36:

AGCTTGCCGC CACTATGTCC GCAGTAAAAG CAGCCCGCTA CGGCAAGGAC AATGTCCGCG 60

TCTACAAGGT TCACAAGGAC GAGAAGACCG GTGTCCAGAC GGTGTACGAG ATGACCGTCT 120

GTGTGCTTCT GGAGGGTGAG ATTGAGACCT CTTACACCAA GGCCGACAAC AGCGTCATTG 180

TCGCAACCGA CTCCATTAAG AACACCATTT ACATCACCGC CAAGCAGAAC CCCGTTACTC 240

CTCCCGAGCT GTTCGGCTCC ATCCTGGGCA CACACTTCAT TGAGAAGTAC AACCACATCC 300

ATGCCGCTCA CGTCAACATT GTCTGCCACC GCTGGACCCG GATGGACATT GACGGCAAGC 360

CACACCCTCA CTCCTTCATC CGCGACAGCG AGGAGAAGCG GAATGTGCAG GTGGACGTGG 420

TCGAGGGCAA GGGCATCGAT ATCAAGTCGT CTCTGTCCGG CCTGACCGTG CTGAAGAGCA 480

CCAACTCGCA GTTCTGGGGC TTCCTGCGTG ACGAGTACAC CACACTTAAG GAGACCTGGG 540

ACCGTATCCT GAGCACCGAC GTCGATGCCA CTTGGCAGTG GAAGAATTTC AGTGGACTCC 600

AGGAGGTCCG CTCGCACGTG CCTAAGTTCG ATGCTACCTG GGCCACTGCT CGCGAGGTCA 660

CTCTGAAGAC TTTTGCTGAA GATAACAGTG CCAGCGTGCA GGCCACTATG TACAAGATGG 720

CAGAGCAAAT CCTGGCGCGC CAGCAGCTGA TCGAGACTGT CGAGTACTCG TTGCCTAACA 780

AGCACTATTT CGAAATCGAC CTGAGCTGGC ACAAGGGCCT CCAAAACACC GGCAAGAACG 840

CCGAGGTCTT CGCTCCTCAG TCGGACCCCA ACGGTCTGAT CAAGTGTACC GTCGGCCGGT 900

Line    Error                              Original Text

37      Wrong application Serial Number    (A) APPLICATION NUMBER: US/07/920,519